2D Convolution example using global and shared memory #2228

mehmetyusufoglu · 2024-01-22T11:43:47Z

An example of a 2D convolutional filter applied to a matrix. In the first commit, the values of the filter matrix were kept in constant memory. Because the GitLab pipeline failed with the error "The SYCL backend does not support global device constants", the second commit removed the use of constant memory.

Kernel1: Uses global memory only, without tiling.
Kernel2: Uses tiling. The block size is assumed to be equal to the tile size. The tile is first copied to shared memory, since each element of the tile is accessed many times. Each block works on the domain of one tile. At the border of a tile, some matrix values from outside the tile (at the border with a neighbouring tile) are needed; those values are read from global memory.
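The difference between the two kernels can be sketched on the host in plain C++. This is not the code from the PR and it does not use alpaka. The names `idx`, `convolveDirect` and `convolveTiled`, the zero treatment of elements outside the matrix, and the unflipped filter indexing are all assumptions made for illustration. `tileBuf` stands in for shared memory and `in` stands in for global memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major index helper (hypothetical, for this sketch only).
inline std::size_t idx(int row, int col, int width)
{
    return static_cast<std::size_t>(row) * static_cast<std::size_t>(width) + static_cast<std::size_t>(col);
}

// Kernel1 analogue: every input value is read directly from the full matrix.
std::vector<float> convolveDirect(std::vector<float> const& in, int height, int width,
                                  std::vector<float> const& filter, int filterWidth)
{
    assert(in.size() == idx(height, 0, width));
    assert(filter.size() == idx(filterWidth, 0, filterWidth));
    int const radius = filterWidth / 2;
    std::vector<float> out(in.size(), 0.0f);
    for(int row = 0; row < height; ++row)
        for(int col = 0; col < width; ++col)
        {
            float sum = 0.0f;
            for(int fr = 0; fr < filterWidth; ++fr)
                for(int fc = 0; fc < filterWidth; ++fc)
                {
                    int const y = row + fr - radius;
                    int const x = col + fc - radius;
                    // Assumption: elements outside the matrix contribute zero.
                    if(y >= 0 && y < height && x >= 0 && x < width)
                        sum += filter[idx(fr, fc, filterWidth)] * in[idx(y, x, width)];
                }
            out[idx(row, col, width)] = sum;
        }
    return out;
}

// Kernel2 analogue: one tile at a time is staged in a small buffer; values
// outside the current tile are still read from the full matrix.
std::vector<float> convolveTiled(std::vector<float> const& in, int height, int width,
                                 std::vector<float> const& filter, int filterWidth, int tile)
{
    assert(in.size() == idx(height, 0, width));
    assert(filter.size() == idx(filterWidth, 0, filterWidth));
    assert(tile >= 1);
    int const radius = filterWidth / 2;
    std::vector<float> out(in.size(), 0.0f);
    std::vector<float> tileBuf(idx(tile, 0, tile), 0.0f);
    for(int tileRow = 0; tileRow < height; tileRow += tile)
        for(int tileCol = 0; tileCol < width; tileCol += tile)
        {
            // Stage the tile (the "copy to shared memory" step).
            for(int ty = 0; ty < tile; ++ty)
                for(int tx = 0; tx < tile; ++tx)
                    if(tileRow + ty < height && tileCol + tx < width)
                        tileBuf[idx(ty, tx, tile)] = in[idx(tileRow + ty, tileCol + tx, width)];
            // Compute every output element that belongs to this tile.
            for(int ty = 0; ty < tile; ++ty)
                for(int tx = 0; tx < tile; ++tx)
                {
                    int const row = tileRow + ty;
                    int const col = tileCol + tx;
                    if(row >= height || col >= width)
                        continue;
                    float sum = 0.0f;
                    for(int fr = 0; fr < filterWidth; ++fr)
                        for(int fc = 0; fc < filterWidth; ++fc)
                        {
                            int const y = row + fr - radius;
                            int const x = col + fc - radius;
                            if(y < 0 || y >= height || x < 0 || x >= width)
                                continue;
                            int const ly = y - tileRow;
                            int const lx = x - tileCol;
                            bool const inTile = ly >= 0 && ly < tile && lx >= 0 && lx < tile;
                            // Inside the tile: staged copy. At the tile border: full matrix.
                            float const v = inTile ? tileBuf[idx(ly, lx, tile)] : in[idx(y, x, width)];
                            sum += filter[idx(fr, fc, filterWidth)] * v;
                        }
                    out[idx(row, col, width)] = sum;
                }
        }
    return out;
}
```

Both functions accumulate the products in the same order, so they should return identical results for the same input; only the source of each input value differs.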

psychocoderHPC · 2024-02-05T09:32:24Z

example/convolution2D/src/convolution2D.cpp

+        // Allocate shared memory
+        auto* const sharedN = alpaka::getDynSharedMem<TElem>(acc);
+        // Fill shared memory of device so that tile items are accessed from shared memory
+        if(row < matrixHeight && col < matrixWidth && blockThreadIdx1D < blockThreadExtent.prod())


On the host side you use getValidWorkDiv. This means you will have one thread per block for some alpaka accelerators.
I know you wrote that the block size must be equal to the tile size, but you do not enforce it, e.g. with an ALPAKA_VERIFY.
If you have only one thread in the block, you can simply iterate over the shared memory to fill it.

Yes, but if there is one thread per block, it means the device is not a GPU (or that using a GPU is not worthwhile), so we don't know which level of memory is used?

If there is one block, the whole block is loaded into shared memory in the code.
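The reviewer's suggestion to iterate over the shared memory can be sketched as a block-strided loop, which fills the whole tile whether the block has one thread or many. This is plain C++ that simulates the threads of one block sequentially; it is not alpaka code and not the code from the PR. The names `fillTileStrided` and `fillTileWithBlock`, their parameters, and the zero fill for positions outside the matrix are hypothetical. The `assert` calls play the role of the enforcement the reviewer asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Work done by ONE thread of a block: it fills elements threadIdx1D,
// threadIdx1D + blockThreads, ... of the tile. With blockThreads == 1 this
// degenerates to a plain loop over the whole tile.
void fillTileStrided(std::vector<float> const& matrix, int matrixHeight, int matrixWidth,
                     int tileRow0, int tileCol0, int tileSize,
                     int threadIdx1D, int blockThreads, std::vector<float>& tile)
{
    // Enforce the preconditions instead of only documenting them.
    assert(blockThreads >= 1 && threadIdx1D >= 0 && threadIdx1D < blockThreads);
    assert(tile.size() == static_cast<std::size_t>(tileSize) * static_cast<std::size_t>(tileSize));
    for(int i = threadIdx1D; i < tileSize * tileSize; i += blockThreads)
    {
        int const row = tileRow0 + i / tileSize;
        int const col = tileCol0 + i % tileSize;
        bool const inside = row < matrixHeight && col < matrixWidth;
        // Assumption: tile positions outside the matrix are filled with zero.
        tile[static_cast<std::size_t>(i)] = inside
            ? matrix[static_cast<std::size_t>(row) * static_cast<std::size_t>(matrixWidth)
                     + static_cast<std::size_t>(col)]
            : 0.0f;
    }
}

// Simulates a whole block by running its threads one after another.
std::vector<float> fillTileWithBlock(std::vector<float> const& matrix, int matrixHeight, int matrixWidth,
                                     int tileRow0, int tileCol0, int tileSize, int blockThreads)
{
    std::vector<float> tile(static_cast<std::size_t>(tileSize) * static_cast<std::size_t>(tileSize), 0.0f);
    for(int t = 0; t < blockThreads; ++t)
        fillTileStrided(matrix, matrixHeight, matrixWidth, tileRow0, tileCol0, tileSize, t, blockThreads, tile);
    return tile;
}
```

In a real kernel the threads of a block run concurrently, so a block-level synchronization (in alpaka, `alpaka::syncBlockThreads(acc)`) would be needed after the fill and before any thread reads the tile.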

mehmetyusufoglu marked this pull request as draft January 22, 2024 11:44

mehmetyusufoglu force-pushed the convolution2DExample branch 2 times, most recently from 9c27efd to aa3efcc Compare January 24, 2024 10:14

mehmetyusufoglu changed the title ~~[Wip] 2D Convolution example using global and shared memory~~ 2D Convolution example using global and shared memory Jan 24, 2024

mehmetyusufoglu marked this pull request as ready for review January 24, 2024 10:36

mehmetyusufoglu force-pushed the convolution2DExample branch 3 times, most recently from dc160f3 to 2b93807 Compare January 24, 2024 17:16

mehmetyusufoglu marked this pull request as draft January 24, 2024 22:06

mehmetyusufoglu force-pushed the convolution2DExample branch from 80dad53 to d16a616 Compare January 25, 2024 16:22

psychocoderHPC added the Type:Example label Jan 25, 2024

mehmetyusufoglu force-pushed the convolution2DExample branch from bd30c0e to a05dabc Compare January 26, 2024 10:12

mehmetyusufoglu marked this pull request as ready for review January 26, 2024 10:12

mehmetyusufoglu force-pushed the convolution2DExample branch 9 times, most recently from ae50779 to a8ac8e6 Compare January 28, 2024 22:57

psychocoderHPC added this to the 1.2.0 milestone Jan 29, 2024

psychocoderHPC requested changes Feb 5, 2024

psychocoderHPC requested changes Feb 21, 2024

psychocoderHPC requested changes Feb 21, 2024

mehmetyusufoglu force-pushed the convolution2DExample branch from 91b7f51 to 4673202 Compare February 22, 2024 09:29

Convolution2D filter example using global and shared memory

06f9ace

mehmetyusufoglu force-pushed the convolution2DExample branch from 4673202 to 06f9ace Compare February 27, 2024 09:54

psychocoderHPC approved these changes Feb 28, 2024

psychocoderHPC merged commit 6116586 into alpaka-group:develop Feb 28, 2024
22 checks passed
